Virtual to Real Reinforcement Learning for Autonomous Driving

Authors

  • Yurong You
  • Xinlei Pan
  • Ziyan Wang
  • Cewu Lu
Abstract

Reinforcement learning is considered a promising direction for driving policy learning. However, training an autonomous driving vehicle with reinforcement learning in a real environment involves unaffordable trial and error. It is more desirable to first train in a virtual environment and then transfer to the real environment. In this paper, we propose a novel realistic translation network that makes a model trained in a virtual environment workable in the real world. The proposed network converts a non-realistic virtual image input into a realistic one with similar scene structure. Given realistic frames as input, a driving policy trained by reinforcement learning can adapt well to real-world driving. Experiments show that our proposed virtual to real (VR) reinforcement learning (RL) works well. To our knowledge, this is the first successful case of a driving policy trained by reinforcement learning that can adapt to real-world driving data.

Autonomous driving aims to make a vehicle sense its environment and navigate without human input. To achieve this goal, the most important task is to learn a driving policy that automatically outputs control signals for the steering wheel, throttle, brake, etc., based on the observed surroundings. The straightforward approach is end-to-end supervised learning [3, 4], which trains a neural network model that maps visual input directly to action output, using labeled image-action pairs as training data. However, a supervised approach usually requires a large amount of data to train a model [31] that can generalize to different environments. Obtaining that much data is time-consuming and requires significant human involvement. By contrast, reinforcement learning learns by trial and error and does not require explicit human supervision. Recently, reinforcement learning has been considered a promising technique for learning driving policies because of its strength in action planning [15, 23, 25].
However, reinforcement learning requires agents to interact with environments, and undesirable driving actions will occur. Training autonomous driving cars in the real world would damage vehicles and their surroundings. Therefore, most current research on autonomous driving with reinforcement learning focuses on simulations [15, 18, 25] rather than training in the real world. While an agent trained with reinforcement learning achieves near human-level driving performance in the virtual world [18], it may not be applicable to real-world driving environments, since the visual appearance of a virtual simulation environment differs from that of a real-world driving scene.

Figure 1: Framework for virtual to real reinforcement learning for autonomous driving. Virtual images rendered by a simulator (environment) are first segmented into a scene parsing representation and then translated to synthetic realistic images by the proposed image translation network (VISRI). The agent observes the synthetic realistic images and takes actions. The environment gives a reward to the agent. Since the agent is trained using realistic images that are visually similar to real-world scenes, it can adapt well to real-world driving.

While virtual driving scenes have a different visual appearance from real driving scenes, they share a similar scene parsing structure. For example, virtual and real driving scenes may both contain roads, trees, buildings, etc., though the textures may be significantly different. Therefore, it is reasonable to expect that by translating virtual images to their realistic counterparts, we can obtain a simulation environment that looks very similar to the real world in terms of both scene parsing structure and object appearance.

© 2017. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms. arXiv:1704.03952v4 [cs.AI], 26 Sep 2017.
Recently, generative adversarial networks (GANs) [9] have drawn a lot of attention in image generation. The work in [11] proposed an image-to-image translation network that can translate images from one domain to another using paired data from both domains. However, it is very hard to find paired virtual and real-world images for driving, which makes it difficult to apply this method to our case of translating virtual driving images to realistic ones.

In this paper, we propose a realistic translation network that helps train a self-driving car entirely in a virtual world so that it can adapt to real-world driving environments. Our proposed framework (shown in Figure 1) converts virtual images rendered by the simulator into realistic ones and trains the reinforcement learning agent with the synthesized realistic images. Though virtual and realistic images have different visual appearances, they share a common scene parsing representation (a segmentation map of roads, vehicles, etc.). Therefore, we can translate virtual images to realistic images by using the scene parsing representation as the interim. This insight is similar to natural language translation, where semantic meaning is the interim between different languages.

Specifically, our realistic translation network includes two modules. The first is a virtual-to-parsing (or virtual-to-segmentation) module that produces a scene parsing representation of the input virtual image. The second is a parsing-to-real network that translates scene parsing representations into realistic images. With the realistic translation network, a reinforcement learning model learned on the realistic driving data can apply well to real-world driving.

To demonstrate the effectiveness of our method, we trained our reinforcement learning model by using the realistic translation network to convert virtual images into synthetic realistic images and feeding these realistic images as state inputs.
We further compared our method with supervised learning and with other reinforcement learning approaches that use domain randomization [22]. Our experiments show that a reinforcement learning model trained with translated realistic images performs better than both a reinforcement learning model trained with only virtual input and virtual to real reinforcement learning with domain randomization.
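The two-module translation and the agent loop described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the authors' implementation: the lookup tables `VIRTUAL_TO_CLASS` and `CLASS_TO_REAL` stand in for the learned virtual-to-parsing and parsing-to-real networks, and `ToySimulator`, `TranslatedEnv`, `run_episode`, the pixel codes, and the reward rule are all hypothetical names and values introduced here.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]         # toy frame: 2D grid of integer pixel codes
Segmentation = List[List[str]]  # scene parsing: one class label per pixel

# Hypothetical lookup tables standing in for the two learned networks.
VIRTUAL_TO_CLASS = {0: "road", 1: "tree", 2: "building"}
CLASS_TO_REAL = {"road": 100, "tree": 150, "building": 200}


def virtual_to_parsing(virtual: Image) -> Segmentation:
    """Module 1: map a virtual frame to its scene parsing representation."""
    return [[VIRTUAL_TO_CLASS[p] for p in row] for row in virtual]


def parsing_to_real(parsing: Segmentation) -> Image:
    """Module 2: render a realistic-looking frame from the scene parsing."""
    return [[CLASS_TO_REAL[c] for c in row] for row in parsing]


def realistic_translation(virtual: Image) -> Image:
    """Compose both modules, using the parsing as the interim representation."""
    return parsing_to_real(virtual_to_parsing(virtual))


class ToySimulator:
    """Hypothetical simulator that renders virtual frames and gives rewards."""

    def __init__(self, horizon: int = 3) -> None:
        self.horizon = horizon
        self.t = 0

    def _render(self) -> Image:
        return [[1, 2, 1], [0, 0, 0]]  # trees and a building above a road

    def reset(self) -> Image:
        self.t = 0
        return self._render()

    def step(self, action: str) -> Tuple[Image, float, bool]:
        self.t += 1
        reward = 1.0 if action == "straight" else -1.0  # assumed toy reward
        return self._render(), reward, self.t >= self.horizon


class TranslatedEnv:
    """Wrapper so the agent observes only translated (realistic) frames."""

    def __init__(self, env: ToySimulator,
                 translate: Callable[[Image], Image]) -> None:
        self.env = env
        self.translate = translate

    def reset(self) -> Image:
        return self.translate(self.env.reset())

    def step(self, action: str) -> Tuple[Image, float, bool]:
        frame, reward, done = self.env.step(action)
        return self.translate(frame), reward, done


def run_episode(env: TranslatedEnv, policy: Callable[[Image], str]) -> float:
    """Roll out one episode and return the total reward."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total
```

Wrapping the simulator, rather than changing the agent, keeps the policy unaware of where its frames come from, which is the property that would let the same policy later be fed real camera frames. For example, `run_episode(TranslatedEnv(ToySimulator(), realistic_translation), lambda obs: "straight")` rolls out a constant policy on translated frames.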

Similar resources

Autonomous Driving in Reality with Reinforcement Learning and Image Translation

Supervised learning is widely used to train autonomous driving vehicles. However, it requires a large amount of labeled data. Reinforcement learning can be trained without abundant labeled data, but we cannot train it in reality because it would involve many unpredictable accidents. Nevertheless, training an agent with good performance in virtual environment is relatively much...

Simulated Autonomous Driving on Realistic Road Networks using Deep Reinforcement Learning

Using Deep Reinforcement Learning (DRL) can be a promising approach to handle various tasks in the field of (simulated) autonomous driving. However, recent publications mainly consider learning in unusual driving environments. This paper presents Driving School for Autonomous Agents (DSA2), a software for validating DRL algorithms in more usual driving environments based on artificial and reali...

Control Behavior of 3D Humanoid Animation Object Using Reinforcement Learning

The ability to learn is a potentially compelling and important quality for interactive 3D human avatars or virtual humans. To that end, we describe a practical approach to real-time learning for 3D virtual humans. Our implementation is grounded in the techniques of reinforcement learning and informed by insights from avatar’s behavior training. It simulates the learning task for characters by e...

Towards personalized human AI interaction - adapting the behavior of AI agents using neural signatures of subjective interest

Reinforcement Learning AI commonly uses reward/penalty signals that are objective and explicit in an environment – e.g. game score, completion time, etc. – in order to learn the optimal strategy for task performance. However, Human-AI interaction for such AI agents should include additional reinforcement that is implicit and subjective – e.g. human preferences for certain AI behavior – in order...

Transferring Autonomous Driving Knowledge on Simulated and Real Intersections

We view intersection handling on autonomous vehicles as a reinforcement learning problem, and study its behavior in a transfer learning setting. We show that a network trained on one type of intersection generally is not able to generalize to other intersections. However, a network that is pre-trained on one intersection and fine-tuned on another performs better on the new task compared to trai...

Journal title:
  • CoRR

Volume abs/1704.03952

Publication date 2017